<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>Requirements and Functionality &mdash; cuFFTDx 1.0.0 documentation</title>
      <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
      <link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
      <link rel="stylesheet" href="_static/cufftdx_override.css" type="text/css" />
  <!--[if lt IE 9]>
    <script src="_static/js/html5shiv.min.js"></script>
  <![endif]-->
  
        <script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
        <script src="_static/jquery.js"></script>
        <script src="_static/underscore.js"></script>
        <script src="_static/doctools.js"></script>
    <script src="_static/js/theme.js"></script>
    <link rel="index" title="Index" href="genindex.html" />
    <link rel="search" title="Search" href="search.html" />
    <link rel="next" title="cuFFTDx API Reference" href="api/index.html" />
    <link rel="prev" title="Achieving high performance" href="performance.html" /> 
</head>

<body class="wy-body-for-nav"> 
  <div class="wy-grid-for-nav">
    <nav data-toggle="wy-nav-shift" class="wy-nav-side">
      <div class="wy-side-scroll">
        <div class="wy-side-nav-search" > 
            <a href="index.html" class="icon icon-home"> cuFFTDx
          </a>
              <div class="version">
                1.0.0
              </div>
<div role="search">
  <form id="rtd-search-form" class="wy-form" action="search.html" method="get">
    <input type="text" name="q" placeholder="Search docs" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
</div>

  <style>
    /* Sidebar header (and topbar for mobile) */
    .wy-side-nav-search, .wy-nav-top {
      background: #76b900;
    }

    .wy-side-nav-search a:link, .wy-nav-top a:link {
      color: #fff;
    }
    .wy-side-nav-search a:visited, .wy-nav-top a:visited {
      color: #fff;
    }
    .wy-side-nav-search a:hover, .wy-nav-top a:hover {
      color: #fff;
    }

    .wy-menu-vertical a:link, .wy-menu-vertical a:visited {
      color: #d9d9d9
    }

    .wy-menu-vertical a:active {
      background-color: #76b900
    }

    .wy-side-nav-search>div.version {
      color: rgba(0, 0, 0, 0.3)
    }

    /* override table width restrictions */
    .wy-table-responsive table td, .wy-table-responsive table th {
        white-space: normal;
    }

    .wy-table-responsive {
        margin-bottom: 24px;
        max-width: 100%;
        overflow: visible;
    }
  </style>
  
        </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
              <ul>
<li class="toctree-l1"><a class="reference internal" href="index.html">Documentation home</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">User guide:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="introduction.html">First FFT using cuFFTDx</a><ul>
<li class="toctree-l2"><a class="reference internal" href="introduction.html#what-next">What next?</a></li>
<li class="toctree-l2"><a class="reference internal" href="introduction.html#compilation">Compilation</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="introduction.html#your-next-custom-fft-kernels">Your next custom FFT kernels</a><ul>
<li class="toctree-l2"><a class="reference internal" href="introduction.html#what-happens-under-the-hood">What happens under the hood?</a></li>
<li class="toctree-l2"><a class="reference internal" href="introduction.html#why">Why?</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="performance.html">Achieving high performance</a><ul>
<li class="toctree-l2"><a class="reference internal" href="performance.html#general-advice">General advice</a></li>
<li class="toctree-l2"><a class="reference internal" href="performance.html#memory-management">Memory management</a></li>
<li class="toctree-l2"><a class="reference internal" href="performance.html#kernel-fusion">Kernel fusion</a></li>
<li class="toctree-l2"><a class="reference internal" href="performance.html#advanced">Advanced</a></li>
<li class="toctree-l2"><a class="reference internal" href="performance.html#further-reading">Further reading</a><ul>
<li class="toctree-l3"><a class="reference internal" href="performance.html#references">References</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Requirements and Functionality</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#requirements">Requirements</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#supported-compilers">Supported Compilers</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#supported-functionality">Supported Functionality</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="api/index.html">cuFFTDx API Reference</a><ul>
<li class="toctree-l2"><a class="reference internal" href="api/operators.html">Operators</a><ul>
<li class="toctree-l3"><a class="reference internal" href="api/operators.html#description-operators">Description Operators</a><ul>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#size-operator">Size Operator</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#direction-operator">Direction Operator</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#type-operator">Type Operator</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#precision-operator">Precision Operator</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#sm-operator">SM Operator</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="api/operators.html#execution-operators">Execution Operators</a><ul>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#thread-operator">Thread Operator</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#block-operator">Block Operator</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/operators.html#block-configuration-operators">Block Configuration Operators</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="api/traits.html">Traits</a><ul>
<li class="toctree-l3"><a class="reference internal" href="api/traits.html#description-traits">Description Traits</a><ul>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#size-trait">Size Trait</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#type-trait">Type Trait</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#direction-trait">Direction Trait</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#precision-trait">Precision Trait</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#is-fft-trait">Is FFT? Trait</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#is-fft-execution-trait">Is FFT Execution? Trait</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#is-fft-complete-trait">Is FFT-complete? Trait</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#is-fft-complete-execution-trait">Is FFT-complete Execution? Trait</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="api/traits.html#execution-traits">Execution Traits</a><ul>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#thread-traits">Thread Traits</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/traits.html#block-traits">Block Traits</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="api/methods.html">Execution Methods</a><ul>
<li class="toctree-l3"><a class="reference internal" href="api/methods.html#thread-execute-method">Thread Execute Method</a></li>
<li class="toctree-l3"><a class="reference internal" href="api/methods.html#block-execute-method">Block Execute Method</a><ul>
<li class="toctree-l4"><a class="reference internal" href="api/methods.html#value-format">Value Format</a></li>
<li class="toctree-l4"><a class="reference internal" href="api/methods.html#input-output-data-format">Input/Output Data Format</a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="api/methods.html#make-workspace-function">Make Workspace Function</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="release_notes.html">Release Notes</a><ul>
<li class="toctree-l2"><a class="reference internal" href="release_notes.html#id1">1.0.0</a><ul>
<li class="toctree-l3"><a class="reference internal" href="release_notes.html#new-features">New Features</a></li>
<li class="toctree-l3"><a class="reference internal" href="release_notes.html#resolved-issues">Resolved Issues</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="release_notes.html#id2">0.3.1</a><ul>
<li class="toctree-l3"><a class="reference internal" href="release_notes.html#known-issues">Known Issues</a></li>
</ul>
</li>
</ul>
</li>
</ul>

        </div>
      </div>
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="index.html">cuFFTDx</a>
      </nav>

      <div class="wy-nav-content">
        <div class="rst-content">
          <div role="navigation" aria-label="Page navigation">
  <ul class="wy-breadcrumbs">
      <li><a href="index.html" class="icon icon-home"></a> &raquo;</li>
      <li>Requirements and Functionality</li>
      <li class="wy-breadcrumbs-aside">
      </li>
  </ul>
  <hr/>
</div>
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
             
  <div class="section" id="requirements-and-functionality">
<span id="requirements-label"></span><h1>Requirements and Functionality<a class="headerlink" href="#requirements-and-functionality" title="Permalink to this headline">¶</a></h1>
<hr class="docutils" />
<div class="section" id="requirements">
<h2>Requirements<a class="headerlink" href="#requirements" title="Permalink to this headline">¶</a></h2>
<p>cuFFTDx is a header-only CUDA C++ library, so the list of software required to use it is relatively small. Users need:</p>
<ul class="simple">
<li><p>CUDA Toolkit 11.0 or newer</p></li>
<li><p>Supported CUDA compiler</p></li>
<li><p>Supported host compiler (C++17 required)</p></li>
<li><p>(Optionally) CMake (version 3.18 or greater)</p></li>
</ul>
<div class="section" id="supported-compilers">
<h3>Supported Compilers<a class="headerlink" href="#supported-compilers" title="Permalink to this headline">¶</a></h3>
<p><strong>CUDA Compilers:</strong></p>
<ul class="simple">
<li><p>NVCC 11.0.194+ (CUDA Toolkit 11.0 or newer)</p></li>
<li><p>(Experimental support) NVRTC 11.0.194+ (CUDA Toolkit  11.0 or newer)</p></li>
</ul>
<p><strong>Host / C++ Compilers:</strong></p>
<ul class="simple">
<li><p>GCC 7+</p></li>
<li><p>Clang 9+ (only on Linux/WSL2)</p></li>
<li><p>Compiling with MSVC (Windows) is not supported</p></li>
</ul>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>cuFFTDx emits errors for unsupported compiler versions. These errors can be silenced by defining <code class="code highlight cpp docutils literal notranslate"><span class="n"><span class="pre">CUFFTDX_IGNORE_DEPRECATED_COMPILER</span></span></code>
during compilation. cuFFTDx is not guaranteed to work with compiler versions that it does not support.</p>
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>cuFFTDx emits errors for unsupported versions of the C++ standard. These errors can be silenced by defining <code class="code highlight cpp docutils literal notranslate"><span class="n"><span class="pre">CUFFTDX_IGNORE_DEPRECATED_DIALECT</span></span></code>
during compilation. cuFFTDx is not guaranteed to work with versions of the C++ standard that it does not support.</p>
</div>
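<p>The host compiler and dialect requirements listed above can be sketched as a small standalone check. This is only an illustration: <code>HostCompiler</code>, <code>host_compiler_supported</code>, <code>dialect_supported</code>, and <code>current_compiler</code> are hypothetical names, not part of cuFFTDx, and the sketch does not reproduce how the library itself detects compilers. It also ignores the restriction that Clang is supported only on Linux/WSL2.</p>

```cpp
#include <cassert>

// Hypothetical helper type for illustration; not part of cuFFTDx.
enum class HostCompiler { GCC, Clang, MSVC, Other };

// Minimum host compiler versions as listed above: GCC 7+, Clang 9+.
// MSVC and any other compiler are treated as unsupported.
constexpr bool host_compiler_supported(HostCompiler c, int major) {
    return c == HostCompiler::GCC   ? major >= 7
         : c == HostCompiler::Clang ? major >= 9
                                    : false;
}

// C++17 is required; a C++17 compiler reports __cplusplus >= 201703L.
constexpr bool dialect_supported(long cplusplus) {
    return cplusplus >= 201703L;
}

// Identify the compiler building this translation unit.
// Clang also defines __GNUC__, so it has to be tested first.
constexpr HostCompiler current_compiler() {
#if defined(__clang__)
    return HostCompiler::Clang;
#elif defined(__GNUC__)
    return HostCompiler::GCC;
#elif defined(_MSC_VER)
    return HostCompiler::MSVC;
#else
    return HostCompiler::Other;
#endif
}
```

<p>A project could evaluate <code>dialect_supported(__cplusplus)</code> in a <code>static_assert</code> to fail early with its own message, rather than silencing the library's checks with the macros described in the notes above.</p>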
</div>
</div>
<div class="section" id="supported-functionality">
<span id="functionality-label"></span><h2>Supported Functionality<a class="headerlink" href="#supported-functionality" title="Permalink to this headline">¶</a></h2>
<dl class="simple">
<dt>Supported functions include:</dt><dd><ul class="simple">
<li><p>Create block descriptors that run collective FFT operations (with one or more threads collaborating to compute one or more FFTs) in a single CUDA block. See <a class="reference internal" href="api/operators.html#block-operator-label"><span class="std std-ref">Block Operator</span></a>.</p></li>
<li><p>Create thread descriptors that run a single FFT operation per thread. This function might require more expertise with cuFFTDx in order to obtain correct results with higher performance. See <a class="reference internal" href="api/operators.html#thread-operator-label"><span class="std std-ref">Thread Operator</span></a>.</p></li>
<li><p>Bi-directional information flow, from the user to the descriptor via <a class="reference internal" href="api/operators.html#operators-label"><span class="std std-ref">Operators</span></a> and from the descriptor to the user via <a class="reference internal" href="api/traits.html#traits-label"><span class="std std-ref">Traits</span></a>.</p></li>
<li><p>Target specific GPU architectures using the <a class="reference internal" href="api/operators.html#sm-operator-label"><span class="std std-ref">SM Operator</span></a>. This enables users to configure the descriptor with suggested parameters to target performance.</p></li>
</ul>
</dd>
</dl>
<p>cuFFTDx supports selected FFT sizes in the range <code class="code highlight cpp docutils literal notranslate"><span class="p"><span class="pre">[</span></span><span class="mi"><span class="pre">0</span></span><span class="p"><span class="pre">;</span></span><span class="w"> </span><span class="n"><span class="pre">max_size</span></span><span class="p"><span class="pre">]</span></span></code> and all sizes in the range <code class="code highlight cpp docutils literal notranslate"><span class="p"><span class="pre">[</span></span><span class="mi"><span class="pre">0</span></span><span class="p"><span class="pre">;</span></span><span class="w"> </span><span class="n"><span class="pre">max_size</span></span><span class="o"><span class="pre">/</span></span><span class="mi"><span class="pre">2</span></span><span class="p"><span class="pre">]</span></span></code>, where <code class="code highlight cpp docutils literal notranslate"><span class="n"><span class="pre">max_size</span></span></code> depends on precision, type,
and CUDA architecture. However, not every combination of size, precision, elements per thread, and FFTs per block is correct and available. The following
table summarizes the available configurations:</p>
<table class="docutils align-default">
<colgroup>
<col style="width: 25%" />
<col style="width: 13%" />
<col style="width: 33%" />
<col style="width: 16%" />
<col style="width: 14%" />
</colgroup>
<tbody>
<tr class="row-odd"><td rowspan="2"><p>Type</p></td>
<td rowspan="2"><p>Precision</p></td>
<td rowspan="2"><p>Thread FFT Sizes</p></td>
<td colspan="2"><p>Block FFT Sizes</p></td>
</tr>
<tr class="row-even"><td><p>Architecture</p></td>
<td><p>Size Range</p></td>
</tr>
<tr class="row-odd"><td rowspan="9"><ul class="simple">
<li><p>Complex-to-complex</p></li>
<li><p>Real-to-complex</p></li>
<li><p>Complex-to-real</p></li>
</ul>
</td>
<td rowspan="3"><p>half</p></td>
<td rowspan="3"><p>All sizes in range: [2; 32]</p></td>
<td><p>75</p></td>
<td><p>[2; 4096]</p></td>
</tr>
<tr class="row-even"><td><p>70;72;86</p></td>
<td><p>[2; 16384]</p></td>
</tr>
<tr class="row-odd"><td><p>80</p></td>
<td><p>[2; 32768]</p></td>
</tr>
<tr class="row-even"><td rowspan="3"><p>float</p></td>
<td rowspan="3"><p>All sizes in range: [2; 32]</p></td>
<td><p>75</p></td>
<td><p>[2; 4096]</p></td>
</tr>
<tr class="row-odd"><td><p>70;72;86</p></td>
<td><p>[2; 16384]</p></td>
</tr>
<tr class="row-even"><td><p>80</p></td>
<td><p>[2; 32768]</p></td>
</tr>
<tr class="row-odd"><td rowspan="3"><p>double</p></td>
<td rowspan="3"><p>All sizes in range: [2; 16]</p></td>
<td><p>75</p></td>
<td><p>[2; 2048]</p></td>
</tr>
<tr class="row-even"><td><p>70;72;86</p></td>
<td><p>[2; 8192]</p></td>
</tr>
<tr class="row-odd"><td><p>80</p></td>
<td><p>[2; 16384]</p></td>
</tr>
</tbody>
</table>
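<p>The upper bounds in the table can be encoded as a compile-time lookup, which may be handy for validating a size before instantiating a descriptor. The following is a minimal sketch: <code>Precision</code>, <code>max_block_fft_size</code>, <code>max_thread_fft_size</code>, and <code>in_block_size_range</code> are hypothetical names that only transcribe the table and are not part of the cuFFTDx API. A size inside the range is not necessarily supported, because only selected sizes in the upper half of the range are available.</p>

```cpp
#include <cassert>

// Hypothetical enum for illustration; not a cuFFTDx type.
enum class Precision { Half, Float, Double };

// Largest block FFT size from the table above, or 0 if the architecture
// is not listed. `sm` is written as in the table: 70, 72, 75, 80, or 86.
// In the table, half and float share the same limits and the double
// limits are exactly half of them.
constexpr unsigned max_block_fft_size(Precision p, unsigned sm) {
    return (sm == 75                             ? 4096u
          : (sm == 70 || sm == 72 || sm == 86)   ? 16384u
          : sm == 80                             ? 32768u
                                                 : 0u)
           / (p == Precision::Double ? 2u : 1u);
}

// Largest thread FFT size from the table above.
constexpr unsigned max_thread_fft_size(Precision p) {
    return p == Precision::Double ? 16u : 32u;
}

// True if `size` lies inside the block size range listed in the table.
// This is a necessary condition only, not a guarantee of support.
constexpr bool in_block_size_range(Precision p, unsigned sm, unsigned size) {
    return size >= 2u && size <= max_block_fft_size(p, sm);
}
```

<p>For example, <code>in_block_size_range(Precision::Double, 75, 4096)</code> evaluates to <code>false</code>, because the table limits double precision on architecture 75 to 2048.</p>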
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>cuFFTDx 0.3.0 added preliminary support for all sizes in the range <code class="code highlight cpp docutils literal notranslate"><span class="p"><span class="pre">[</span></span><span class="mi"><span class="pre">0</span></span><span class="p"><span class="pre">;</span></span><span class="w"> </span><span class="n"><span class="pre">max_size</span></span><span class="o"><span class="pre">/</span></span><span class="mi"><span class="pre">2</span></span><span class="p"><span class="pre">]</span></span></code>. Most sizes require you to create an additional workspace, which involves a global memory allocation. See <a class="reference internal" href="api/methods.html#make-workspace-method-label"><span class="std std-ref">Make Workspace Function</span></a>
for more details about workspaces. You can check whether a given FFT requires a workspace with the <a class="reference internal" href="api/traits.html#requiresworkspace-block-trait-label"><span class="std std-ref">FFT::requires_workspace</span></a> trait.</p>
</div>
<blockquote>
<div><p>A workspace is not required for FFTs of the following sizes:</p>
<ul class="simple">
<li><p>Powers of 2 up to 32768</p></li>
<li><p>Powers of 3 up to 19683</p></li>
<li><p>Powers of 5 up to 15625</p></li>
<li><p>Powers of 6 up to 1296</p></li>
<li><p>Powers of 7 up to 2401</p></li>
<li><p>Powers of 10 up to 10000</p></li>
<li><p>Powers of 11 up to 1331</p></li>
<li><p>Powers of 12 up to 1728</p></li>
</ul>
<dl class="simple">
<dt>In future versions of cuFFTDx:</dt><dd><ul class="simple">
<li><p>The workspace requirement may be removed for other configurations.</p></li>
<li><p>FFT configurations that do not require a workspace today will continue not to require one.</p></li>
</ul>
</dd>
</dl>
</div></blockquote>
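<p>The list of workspace-free sizes can be expressed as a simple predicate. The sketch below only transcribes that list: <code>is_power_of</code> and <code>listed_as_workspace_free</code> are hypothetical helpers, not cuFFTDx functions, and the sketch does not check whether a size is within <code>max_size</code> for a given precision and architecture. The <code>FFT::requires_workspace</code> trait mentioned above remains the authoritative check for a concrete descriptor.</p>

```cpp
#include <cassert>

// True if n == base^k for some integer k >= 1.
// Written recursively so that it is usable in constant expressions.
constexpr bool is_power_of(unsigned n, unsigned base) {
    return n == base                      ? true
         : (n < base || n % base != 0u)   ? false
                                          : is_power_of(n / base, base);
}

// True if n appears in the list of sizes documented above as not
// requiring a workspace. Hypothetical helper; not part of cuFFTDx.
constexpr bool listed_as_workspace_free(unsigned n) {
    return (n <= 32768u && is_power_of(n, 2u))
        || (n <= 19683u && is_power_of(n, 3u))
        || (n <= 15625u && is_power_of(n, 5u))
        || (n <= 1296u  && is_power_of(n, 6u))
        || (n <= 2401u  && is_power_of(n, 7u))
        || (n <= 10000u && is_power_of(n, 10u))
        || (n <= 1331u  && is_power_of(n, 11u))
        || (n <= 1728u  && is_power_of(n, 12u));
}
```

<p>Under this sketch, a size such as 1024 or 1296 is reported as workspace-free, while a size such as 14 or 7776 is not, so a workspace would have to be created for it.</p>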
<dl class="simple">
<dt>Functionality not yet supported includes:</dt><dd><ul class="simple">
<li><p>Input/output stored in global memory. Input data must be in registers (local memory) or shared memory.</p></li>
<li><p>The <a class="reference internal" href="api/operators.html#blockdim-operator-label"><span class="std std-ref">BlockDim Operator</span></a>, which enables fine-grain customization of the CUDA block dimensions.</p></li>
</ul>
</dd>
</dl>
</div>
</div>


           </div>
          </div>
          <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
        <a href="performance.html" class="btn btn-neutral float-left" title="Achieving high performance" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
        <a href="api/index.html" class="btn btn-neutral float-right" title="cuFFTDx API Reference" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
    </div>

  <hr/>

  <div role="contentinfo">
    <p>&#169; Copyright 2022, NVIDIA Corporation.</p>
  </div>

  Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
    <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
    provided by <a href="https://readthedocs.org">Read the Docs</a>.
   

</footer>
        </div>
      </div>
    </section>
  </div>
  <script>
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script>  

  <style>
  a:link, a:visited {
    color: #76b900;
  }

  a:hover {
    color: #8c0;
  }

  .rst-content dl:not(.docutils) dt {
    background: rgba(118, 185, 0, 0.1);
    color: rgba(59,93,0,1);
    border-top: solid 3px rgba(59,93,0,1);
  }
  </style>
  

</body>
</html>